Abstract. Fix a large number n and let $\{x_1, y_1, \ldots, x_n, y_n\}$ be 2n points chosen independently from 
some fixed continuous probability distribution on the real line. Each pair $\{x_i, y_i\}$ determines a random 
interval $[\min(x_i, y_i), \max(x_i, y_i)]$. We examine the structure of the resulting family of intervals, and in 
particular answer the following questions: how large a subcollection of pairwise disjoint intervals can 
one expect to find? And, what is the probability that there is an interval in the family which intersects 
all the others? 

Prelude. Before beginning, we invite the reader to test his intuition on the 
following problem (it won’t be any worse than ours was!). The numbers from 1 to 
2n, with n large, are paired at random, each pair being regarded as the endpoints 
of a real interval. What is the probability that among these n intervals there is one 
which meets all the others? 

1. Introduction: Random Intervals. In studying any type of combinatorial struc- 
ture it is useful, and sometimes quite illuminating, to have models for “random” 
structures of that type. The most famous example is the random graph model of 
Erdős and Rényi [2] in which edges are chosen independently with fixed probability. 
Random graph theory now forms a substantial subject of study in itself. 

The combinatorial structures with which we concern ourselves here are those 
based on intersection properties of families of intervals. The most obvious defini- 
tion of a “random interval” is the interval between two random numbers x and y; 
for convenience we denote that interval by [x, y] even when y < x, so that by 
definition [x, y] = [y, x]. 

We create a family F of random intervals in the following way: fix a number n 
and a continuous probability distribution on the real line. Choose values 
$x_1, y_1, \ldots, x_n, y_n$ independently and let F consist of intervals $I_1, \ldots, I_n$ where 
$I_i = [x_i, y_i]$. 

Since intersection properties of F depend only on the order of the points 
$x_1, \ldots, y_n$, it is clear that the model is insensitive to the choice of distribution. In 
fact, an equivalent discrete model is obtained by choosing at random one of the 
(2n)! assignments of $x_1, \ldots, y_n$ to the integers from 1 to 2n. In the continuous 
model the most convenient distribution seems to be the uniform distribution on 
the unit interval [0, 1]; in that case we may think of the intervals as having arisen 
from random points $(x_i, y_i)$ in the unit square. 
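The two models can be sketched in a few lines of Python. This is only an illustration under our own naming: the functions `continuous_family`, `discrete_family` and `order_pattern` are hypothetical helpers, not part of the original text, and `order_pattern` assumes all endpoints are distinct (which holds with probability 1 in the continuous model).

```python
import random

def continuous_family(n, rng):
    """n random intervals from 2n independent uniform points on [0, 1]."""
    family = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        family.append((min(x, y), max(x, y)))
    return family

def discrete_family(n, rng):
    """n random intervals from a uniformly random assignment of
    x_1, y_1, ..., x_n, y_n to the integers 1, ..., 2n."""
    points = list(range(1, 2 * n + 1))
    rng.shuffle(points)
    return [(min(points[2 * i], points[2 * i + 1]),
             max(points[2 * i], points[2 * i + 1])) for i in range(n)]

def order_pattern(family):
    """Replace each endpoint by its rank, keeping only the relative order."""
    endpoints = sorted(p for interval in family for p in interval)
    rank = {p: r + 1 for r, p in enumerate(endpoints)}
    return [(rank[a], rank[b]) for a, b in family]
```

Applying `order_pattern` to a continuous family yields a pairing of 1, ..., 2n, which is the sense in which only the order of the endpoints matters.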

To any family F of intervals we may associate a graph G = (V, E) as follows: 
the vertices of G correspond to the intervals of F, and two vertices constitute an 
edge of G just when their corresponding intervals have non-null intersection. 
Graphs arising in this fashion are called interval graphs; when F is a family of 
random intervals as described above, they are called random interval graphs [8]. 



Fig. 1a. Eight random points and their intervals. 
Fig. 1b. Random interval graph and interval order. 


We preserve somewhat more of the properties of a family F of intervals by 
assigning to F a partially ordered set P; here the elements of P correspond to the 
intervals of F with x < y just when the interval corresponding to x lies entirely to 
the left of the interval corresponding to y. Partially ordered sets arising in this 
fashion are called interval orders. Note that G may be obtained from P by 
declaring {x, y} to be an edge of G whenever x and y are incomparable in P. 
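As a small illustration of the last remark, the following sketch builds both structures from a family of closed intervals given as (left, right) pairs and recovers G from P. The function names are our own, chosen for this example.

```python
from itertools import combinations

def interval_order(family):
    """Pairs (i, j) such that interval i lies entirely to the left of interval j."""
    return {(i, j)
            for i, (_, right_i) in enumerate(family)
            for j, (left_j, _) in enumerate(family)
            if right_i < left_j}

def interval_graph(family):
    """Edges {i, j} joining intervals with non-null intersection."""
    return {frozenset((i, j))
            for (i, (a, b)), (j, (c, d)) in combinations(enumerate(family), 2)
            if max(a, c) <= min(b, d)}

def graph_from_order(n, order):
    """Edges joining exactly the pairs that are incomparable in the order."""
    return {frozenset((i, j)) for i, j in combinations(range(n), 2)
            if (i, j) not in order and (j, i) not in order}
```

For any family, `graph_from_order(len(family), interval_order(family))` should agree with `interval_graph(family)`.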

For further information about both interval graphs and interval orders, we refer 
the reader to Fishburn’s excellent book [4]. 

Figure 1a depicts 8 random intervals (arising from 8 points chosen in the unit 
square) and Figure 1b depicts the corresponding interval graph and interval 
order. 

2. Chains and Antichains ... or cliques and independent sets. 

Given a collection of n random intervals, what is the size of a largest family of 
pairwise intersecting intervals? ... or pairwise disjoint intervals? In poset language 
we are asking for the sizes of the largest antichain (pairwise incomparable 
elements) and of the largest chain (pairwise comparable, and therefore totally 
ordered, elements). In graph language we seek the sizes of the largest clique 
(pairwise adjacent) and the largest independent set (pairwise nonadjacent). 

The first question (pairwise intersecting) is easier to answer. 

Intervals of reals satisfy the “Helly property,” that is, every collection of 
pairwise-intersecting intervals has a nonempty intersection. Thus, when endpoints 
are uniformly distributed on [0, 1], it is enough to find the point x ∈ [0, 1] which is 
contained in the maximum number of intervals in the collection. The probability 
that an interval contains x is $1 - x^2 - (1 - x)^2 = 2x - 2x^2$. This is maximized 
when x = 1/2, and the expected number of intervals which contain 1/2 is n/2. 
With a bit more work (the interested reader can consult [8]) one can obtain the 
following result: 

Theorem 1. Let the random variable $A_n$ denote the size of a largest set of pairwise 
intersecting intervals in a family of n random intervals. There exists a function f(n) 
such that 

... 

and 

$$\ldots + f(n)\} = \ldots$$
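The Helly property turns the search for a largest pairwise-intersecting subfamily into a search for a most-covered point. A minimal sketch, assuming closed intervals given as (left, right) pairs; the function name is ours:

```python
def largest_intersecting_subfamily(family):
    """Size of a largest pairwise-intersecting subfamily of closed intervals.

    By the Helly property this equals the largest number of intervals
    containing a single point, found by sweeping over the sorted endpoints.
    """
    events = []
    for left, right in family:
        events.append((left, 0))    # 0 sorts first: open before close at ties
        events.append((right, 1))
    events.sort()
    best = depth = 0
    for _, kind in events:
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best
```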

We turn now to the second question: What is the largest collection of pairwise 
non-intersecting intervals (longest chain) in a family of n random intervals? It 
turns out that this question is closely related to the following well-known problem, 
due to Stanislaw Ulam: What is the length of the longest increasing subsequence 
of a random permutation of 1, ..., n? 

Ulam’s problem, posed in [10], was tackled by Hammersley in a famous paper 
[6] in which the answer was shown to be asymptotic to $c\sqrt{n}$ for some constant c. 
Later, a combination of the efforts of Schensted [9], Logan and Shepp [7] and 
Versik and Kerov [11] determined that the constant is, in fact, 2. 

Ulam’s “random permutation” may be constructed by choosing n random 
points from the uniform distribution in the unit square, and numbering the points 
from left to right (i.e., according to their x-coordinates). The permutation arises in 
reading the label-numbers from bottom to top. 

The points determine a partially ordered set Q under the product ordering, 
where $(x_i, y_i) <_Q (x_j, y_j)$ just when $x_i < x_j$ and $y_i < y_j$. An increasing subsequence 
of the permutation corresponds exactly to a chain in Q; geometrically, a subset of 
the points extending from southwest to northeast in the square. Let us denote by 
$X_n$ the random variable whose value is the size of the largest such subset. 

As we have seen, the same set of points in the unit square determines a family 
of random intervals, but in the interval order P, we have the stronger condition 
$(x_i, y_i) <_P (x_j, y_j)$ iff $\max(x_i, y_i) < \min(x_j, y_j)$. Let us denote by $Y_n$ the size of a 
largest chain in P; it is the asymptotic behavior of $Y_n$ which we wish to determine. 
Since every chain of P is a chain of Q, we have $Y_n \le X_n$. 
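To make the comparison concrete, here is a sketch that computes both quantities for a given set of points. It assumes all coordinates are distinct; the longest chain in Q is found by the standard patience-sorting method for increasing subsequences, and the longest chain in P by a simple quadratic dynamic program. Both function names are our own.

```python
from bisect import bisect_left

def longest_chain_Q(points):
    """Longest chain in the product order (a longest increasing subsequence)."""
    ys = [y for _, y in sorted(points)]
    tails = []                      # tails[k]: least final y of a chain of size k + 1
    for y in ys:
        k = bisect_left(tails, y)
        if k == len(tails):
            tails.append(y)
        else:
            tails[k] = y
    return len(tails)

def longest_chain_P(points):
    """Longest chain in the interval order, by O(n^2) dynamic programming."""
    intervals = sorted((min(x, y), max(x, y)) for x, y in points)
    best = []
    for i, (left, _) in enumerate(intervals):
        earlier = (best[j] for j in range(i) if intervals[j][1] < left)
        best.append(1 + max(earlier, default=0))
    return max(best, default=0)
```

On any input with distinct coordinates, `longest_chain_P(points)` should not exceed `longest_chain_Q(points)`.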

For example, consider the point e in Figure 1a. The points greater than e in Q 
are those to the northeast (namely, b, c, d and h). We can also describe geometri- 
cally the points which lie above e in P. Draw a vertical and horizontal line through 
the point e. Also, draw a vertical and horizontal line through e’s reflection across 
the positively sloped diagonal of the square. These four lines divide the unit square 
into 9 rectangles. (See Figure 1a.) The upper right-hand square contains the 
points which correspond to intervals to the right of e (namely, d and h). 

Hammersley [6] used the method of subadditivity to prove that $X_n/\sqrt{n}$ 
approaches a constant in probability, and in fact a nearly identical argument can be 
made to achieve the same result for $Y_n/\sqrt{n}$. (See, for example, Bollobás and 
Winkler [1] where Hammersley’s results are extended to higher dimension.) 

In the case of Ulam’s problem the actual value of c was obtained only with 
great difficulty, and in fact the values of the constants in higher dimensions remain 
unknown. Luckily, a special feature of the interval case allows us both to prove the 
existence of the constant and determine its value with relative ease. The difference 
is that in our interval problem a maximum-length chain can be built in “greedy” 
fashion from the bottom up, whereas the equivalent process in Ulam’s problem 
fails by a constant factor, in the limit, to attain maximal length. 

Theorem 2. Let $Y_n$ denote the maximum number of pairwise disjoint intervals in a 
family of n random intervals. Then 

$$\lim_{n \to \infty} \frac{Y_n}{\sqrt{n}} = \frac{2}{\sqrt{\pi}}$$

in probability. 

This means that for every ε > 0, there exists an $n_0$ so that for all $n > n_0$, 

$$\Pr\left\{ \left| \frac{Y_n}{\sqrt{n}} - \frac{2}{\sqrt{\pi}} \right| < \varepsilon \right\} > 1 - \varepsilon.$$


Proof. Let us establish a Poisson process on the plane of density 1. We select an 
infinite chain $C = \{(u_1, v_1), (u_2, v_2), \ldots\}$ of points of the process in the following 
manner: $(u_1, v_1)$ is the point in the positive quadrant which minimizes $\max(u_1, v_1)$, 
and thereafter, $(u_k, v_k)$ is the point above $(u_{k-1}, v_{k-1})$ in the interval order which 
minimizes $\max(u_k, v_k)$. 
In terms of intervals, C represents the chain built from the bottom by always 
selecting the interval with least possible upper endpoint; it is easily seen by 
induction that in any finite collection of intervals such a chain has maximum 
possible length. 
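For a finite collection the greedy construction can be sketched as follows, together with an exhaustive search against which it can be checked on small inputs. Intervals are (left, right) pairs with distinct endpoints; the function names are ours.

```python
from itertools import combinations

def greedy_chain(family):
    """Pairwise disjoint intervals, always taking the least available upper endpoint."""
    chain = []
    last_right = float("-inf")
    for left, right in sorted(family, key=lambda interval: interval[1]):
        if left > last_right:       # lies entirely to the right of the chain so far
            chain.append((left, right))
            last_right = right
    return chain

def brute_force_max_disjoint(family):
    """Size of a largest pairwise disjoint subfamily (exhaustive; small inputs only)."""
    ordered = sorted(family)
    for size in range(len(ordered), 0, -1):
        for subset in combinations(ordered, size):
            if all(a[1] < b[0] for a, b in zip(subset, subset[1:])):
                return size
    return 0
```

The greedy pass costs one sort, whereas the exhaustive search is exponential and serves only as a check.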

Now, for any positive real s, the region $\{(x, y) : 0 < x, y < s\} = [0, s]^2$ is with 
probability exactly $\exp\{-s^2\}$ unoccupied by a point of the Poisson process. It 
follows that if S is the random variable whose value is $\max(u_1, v_1)$, then the 
probability density of S is given by the function 

$$f(s) = \frac{d}{ds}\left(1 - e^{-s^2}\right) = 2s\,e^{-s^2}$$

whose expected value is 

$$\int_0^\infty 2s^2 e^{-s^2}\,ds = \int_0^\infty t^{1/2} e^{-t}\,dt = \Gamma(3/2) = \frac{\sqrt{\pi}}{2}.$$

The differences 

$$\max(u_1, v_1) - 0,\ \max(u_2, v_2) - \max(u_1, v_1),\ \max(u_3, v_3) - \max(u_2, v_2),\ \ldots$$

are independent and identically distributed with mean $\sqrt{\pi}/2$. It follows from the 
law of large numbers that for any ε > 0, 

$$(1 - \varepsilon)\frac{\sqrt{\pi}}{2} < \frac{\max(u_m, v_m)}{m} < (1 + \varepsilon)\frac{\sqrt{\pi}}{2}$$

with probability at least 1 - ε, for every sufficiently large m. 

Now let r(n) be the least r such that $[0, r]^2$ contains exactly n points of the 
Poisson process; these points then determine a family of n random intervals, as 
described above, and we may therefore identify $Y_n$ with the largest m such that 
$(u_m, v_m)$ lies in the square $[0, r(n)]^2$. Since the Poisson process has density 1, we 
will have 

$$(1 - \varepsilon)\sqrt{n} < r(n) < (1 + \varepsilon)\sqrt{n}$$

with probability at least 1 - ε, for sufficiently large n. 

Let $m_1 = \lfloor (1 - \varepsilon)(2/\sqrt{\pi})\sqrt{n} \rfloor$ and $m_2 = \lfloor (1 + \varepsilon)(2/\sqrt{\pi})\sqrt{n} \rfloor$; then for large 
enough n, we have that $(u_{m_1}, v_{m_1})$ will lie inside the square $[0, r(n)]^2$ and $(u_{m_2}, v_{m_2})$ 
outside the square, with probability at least 1 - ε. We conclude that 

$$(1 - \varepsilon)\frac{2}{\sqrt{\pi}} < \frac{Y_n}{\sqrt{n}} < (1 + \varepsilon)\frac{2}{\sqrt{\pi}}$$

with probability at least 1 - ε, proving the theorem. 
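The limit can also be probed numerically. The sketch below is a rough Monte Carlo experiment, not part of the proof: it draws families in the uniform model on the unit square, counts a maximum set of pairwise disjoint intervals greedily by least upper endpoint, and averages the ratio to the square root of n. The function names, sample sizes and seed are arbitrary choices of ours.

```python
import math
import random

def max_disjoint(family):
    """Number of intervals in a greedy chain (least upper endpoint first)."""
    count = 0
    last_right = float("-inf")
    for left, right in sorted(family, key=lambda interval: interval[1]):
        if left > last_right:
            count += 1
            last_right = right
    return count

def estimate_ratio(n, trials, seed=0):
    """Average of (max disjoint count) / sqrt(n) over independent uniform families."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        family = []
        for _ in range(n):
            x, y = rng.random(), rng.random()
            family.append((min(x, y), max(x, y)))
        total += max_disjoint(family) / math.sqrt(n)
    return total / trials
```

For large n the estimate should lie near $2/\sqrt{\pi} \approx 1.128$, up to sampling fluctuations that shrink as n and the number of trials grow.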

Note: The Poisson process enables us to place the random variables $Y_n$ all in the 
same sample space; thus, with the help of the strong law of large numbers, we may 
obtain the slightly stronger result that 

$$\lim_{n \to \infty} \frac{Y_n}{\sqrt{n}} = \frac{2}{\sqrt{\pi}}$$

with probability 1. 

3. A Matter of Degree. In a graph, the degree of a vertex x, denoted d(x), is 
the number of edges incident with x. The minimum and maximum degrees of a 
graph are denoted by δ and Δ respectively. What can we say about the minimum 
and maximum degree of a random interval graph? 

The minimum degree question is answered in [8]. We simply repeat the result 
here: 

Theorem 3. If $\delta_n$ denotes the minimum degree of a vertex in the interval graph 
generated by n random intervals, then for fixed k > 0, 

$$\lim_{n \to \infty} \Pr(\delta_n < k\sqrt{n}) = 1 - \exp\{-k^2/2\}.$$

It follows that the average minimum degree approaches $\sqrt{n\pi/2}$. 
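The step from the limit law to the mean can be written out. In the sketch below, Z is a symbol we introduce for a random variable having the limiting distribution of $\delta_n/\sqrt{n}$, and we assume that the means converge along with the distributions (convergence in distribution alone does not guarantee this).

```latex
\[
E[Z] = \int_0^\infty \Pr(Z \ge k)\,dk
     = \int_0^\infty e^{-k^2/2}\,dk
     = \sqrt{\frac{\pi}{2}},
\qquad\text{so}\qquad
E[\delta_n] \approx \sqrt{n}\,E[Z] = \sqrt{\frac{n\pi}{2}} .
\]
```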

The maximum degree question is more interesting, partly on account of the role 
played by a point of degree n - 1. Recall that the diameter of a graph is the 
maximum over all pairs (x, y) of vertices of the least number of edges in a path 
from x to y. It is easy to show that with probability approaching 1, the diameter of 
our random interval graph is 2 or 3; and, for an interval graph G, diam(G) ≤ 2 iff 
there is a vertex adjacent to all others. In fact, the following are equivalent for any 
collection of n intervals: 

1. the interval graph G has a vertex of degree n - 1; 

2. G has diameter at most 2; 

3. the interval order P has an isolated point; 

4. there is an interval in the family which meets all the others; and 

5. there is an interval in the family which meets both [u, v] and [p, q], where 
[u, v] is the interval with leftmost right endpoint and [p, q] the interval with 
rightmost left endpoint. 

We are thus moved to ask: What is the limiting probability that in a family of 
random intervals, there is an interval which intersects all the others? 

One normally expects such limits to be either 0 or 1, and in fact in many classes 
of structures (such as the random graphs of Erdős and Rényi) there is a “0-1 Law” 
(in that case proved by Fagin [3]) which guarantees that first-order statements, 
such as “there is a vertex which is adjacent to all others,” must have trivial limiting 
probability. Exceptions to this sort of behavior are sometimes quite startling, as in 
the famous “problème des rencontres” in which the probability that a random 
permutation has no fixed point approaches 1/e. 

Here we shall obtain an answer which not only ruins a possible 0-1 Law for 
random interval graphs, but adds insult to injury in a surprising way. 

Let us consider first the Poisson model, except that this time for convenience, 
let the process be of density n on $[0, 1]^2$. Let S be the minimum, over all points 
(x, y) of the process, of max{x, y}; thus S = max{u, v} where [u, v] is the interval 
with leftmost right endpoint mentioned above. Similarly let R be the position of 
the rightmost left endpoint. Then the “big interval,” i.e., the interval which 
intersects all others, will exist just when there is a point of the process in the union 
of the rectangles [0, S] × [R, 1] and [R, 1] × [0, S]. This will occur with limiting 
probability 


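$$\lim_{n \to \infty} \int_0^\infty \int_0^\infty \left(1 - e^{-2nxy}\right) \frac{\partial^2}{\partial x\,\partial y}\left(1 - e^{-nx^2} - e^{-ny^2} + e^{-n(x^2+y^2)}\right) dx\,dy = \frac{2}{3}.$$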
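The value 2/3 can be checked by hand. The sketch below rests on our reading of the integrand: x stands for S and y for 1 - R, the mixed partial derivative of their joint distribution function is $4n^2xy\,e^{-n(x^2+y^2)}$, and $1 - e^{-2nxy}$ is the probability that the two rectangles, of total area 2xy, contain a point of the process. We substitute $u = \sqrt{n}\,x$, $v = \sqrt{n}\,y$ and then pass to the sum $s = u + v$.

```latex
\begin{align*}
\int_0^\infty\!\!\int_0^\infty \left(1 - e^{-2nxy}\right) 4n^2 xy\, e^{-n(x^2+y^2)}\,dx\,dy
  &= \int_0^\infty\!\!\int_0^\infty \left(1 - e^{-2uv}\right) 4uv\, e^{-(u^2+v^2)}\,du\,dv \\
  &= 1 - \int_0^\infty\!\!\int_0^\infty 4uv\, e^{-(u+v)^2}\,du\,dv \\
  &= 1 - 4\int_0^1 t(1-t)\,dt \int_0^\infty s^3 e^{-s^2}\,ds
     \qquad (u = st,\ v = s(1-t)) \\
  &= 1 - 4\cdot\frac{1}{6}\cdot\frac{1}{2} = \frac{2}{3}.
\end{align*}
```

Under this reading the integral does not depend on n at all once the rescaling is made.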


It was when the authors asked themselves how fast the probability converged to 
2/3 that the startling truth emerged: 


Theorem 4. For all n > 1, the probability that in a collection of n random 
intervals there is one which intersects all the others is exactly 2/3. 
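For small n the statement can be confirmed by brute force in the discrete model, where the integers 1, ..., 2n are paired at random. The sketch below enumerates every pairing and counts exactly with rational arithmetic; the function names are ours, and the enumeration is only practical for small n since the number of pairings is 1 · 3 · 5 ··· (2n - 1).

```python
from fractions import Fraction

def pairings(points):
    """Yield every way of pairing up a sorted list of points."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, mate in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, mate)] + tail

def universal_probability(n):
    """Exact probability that some interval meets all the others."""
    hits = total = 0
    for family in pairings(list(range(1, 2 * n + 1))):
        total += 1
        if any(all(max(a, c) <= min(b, d) for c, d in family) for a, b in family):
            hits += 1
    return Fraction(hits, total)
```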

Proof. One can show, using the uniform model on $[0, 1]^2$, that the above 
probability is 

$$1 - 4n(n - 1) \int_0^1 \int_0^{1-y} xy\left(1 - x^2 - y^2 - 2xy\right)^{n-2} dx\,dy,$$

which a patient reader will reduce to the constant 2/3. Fortunately, there is a 
relatively painless combinatorial proof, given below, which is both more intuitive 
and more powerful. 
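For the patient reader, one possible route through the reduction is sketched here. It assumes the integrand is $xy(1 - x^2 - y^2 - 2xy)^{n-2}$ over the triangle $x + y \le 1$, notes that $1 - x^2 - y^2 - 2xy = 1 - (x+y)^2$, and changes variables to $s = x + y$ and $t = x/s$.

```latex
\begin{align*}
\int_0^1\!\!\int_0^{1-y} xy\left(1 - (x+y)^2\right)^{n-2} dx\,dy
  &= \int_0^1 t(1-t)\,dt \int_0^1 s^3 (1 - s^2)^{n-2}\,ds
     \qquad (x = st,\ y = s(1-t)) \\
  &= \frac{1}{6}\cdot\frac{1}{2}\int_0^1 w(1-w)^{n-2}\,dw
     \qquad (w = s^2) \\
  &= \frac{1}{12}\cdot\frac{1}{n(n-1)},
\end{align*}
so that
\[
1 - 4n(n-1)\cdot\frac{1}{12\,n(n-1)} = \frac{2}{3}.
\]
```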

In this proof we employ the “discrete” model, in which integers between 1 and 
2n are paired at random. Once the intervals have been selected, we label the 
endpoints A(1), B(1), ..., A(n - 2), B(n - 2) recursively as follows: 

Refer to the endpoints {1, ..., n} as the left side, and {n + 1, ..., 2n} as the right 
side. Let A(1) = n and let B(1) be its mate. Suppose we have assigned through 
A(j), B(j). We attach the labels A(j + 1) and B(j + 1) by the following rules: 

• If B(j) is on the left side: Let A(j + 1) be the leftmost point on the right side 
which has not yet been labeled. Let B(j + 1) be its mate. 

• If B(j) is on the right side: Let A(j + 1) be the rightmost point on the left 
side which has not yet been labeled. Let B(j + 1) be its mate. 

If A(j) < B(j) we say that this interval “went to the right”; otherwise, it “went to 
the left”. Note that we are labeling endpoints of intervals from the center 
outwards, starting from the left when the last interval went to the right, and 
vice-versa. Endpoints marked A(·) are called inner points and those marked B(·) 
are called outer. (See Figure 2.) 


Fig. 2. Labeling intervals. 
It is easy to prove by induction that immediately after the labels A(j) and B(j) 
have been assigned either: 

1. an equal number of points have been assigned from the left and right sides 
(in case A(j) < B(j)), or 

2. two more points have been labeled on the left than on the right (in case 
A(j) > B(j)). 

Now when the labels A(n - 2) and B(n — 2) have been assigned, there remain 
four endpoints a < b < c < d which are unlabeled. Note that the three ways of 
pairing them are equiprobable. Our claim is that if a is paired with either c or d, 
then that interval intersects all others, but if a is paired with b, then no interval 
intersects all others. This will prove the result. 
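The labeling procedure and the claim can be tried out mechanically. The sketch below is our own rendering of the rules, with hypothetical function names: a pairing of 1, ..., 2n is a list of (smaller, larger) pairs, the labels are produced from the center outwards, and the claim about the four unlabeled endpoints is then checked for one pairing. It assumes n is at least 2.

```python
def pairings(points):
    """Yield every way of pairing up a sorted list of points."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, mate in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, mate)] + tail

def label_endpoints(pairs):
    """Return ([(A(1), B(1)), ..., (A(n-2), B(n-2))], sorted unlabeled endpoints)."""
    n = len(pairs)
    mate = {}
    for p, q in pairs:
        mate[p], mate[q] = q, p
    labeled, labels = set(), []
    for j in range(1, n - 1):
        if j == 1:
            a = n
        elif labels[-1][1] <= n:
            # B(j-1) is on the left side: leftmost unlabeled point on the right.
            a = min(p for p in range(n + 1, 2 * n + 1) if p not in labeled)
        else:
            # B(j-1) is on the right side: rightmost unlabeled point on the left.
            a = max(p for p in range(1, n + 1) if p not in labeled)
        labels.append((a, mate[a]))
        labeled.update((a, mate[a]))
    return labels, sorted(set(range(1, 2 * n + 1)) - labeled)

def has_universal(pairs):
    """True when some interval meets all the others."""
    return any(all(max(a, c) <= min(b, d) for c, d in pairs) for a, b in pairs)

def claim_holds(pairs):
    """Check the claim about the unlabeled endpoints a < b < c < d for one pairing."""
    _, (a, b, c, d) = label_endpoints(pairs)
    mate_of_a = next(q if p == a else p for p, q in pairs if a in (p, q))
    if mate_of_a == b:
        return not has_universal(pairs)
    # a is paired with c or d: that interval should meet every interval.
    return all(max(a, p) <= min(mate_of_a, q) for p, q in pairs)
```

Running `claim_holds` over every pairing for a few small n exercises both cases of the argument that follows.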

We consider the only two possible cases: 

1. a and b are on the left and c and d are on the right, and 

2. only a is on the left. 

In either case, all points labeled with A(·) are between a and c, for otherwise one 
of a or c would have been labeled. It follows that either [a, c] or [a, d] meets all 
other intervals. 

On the other hand, suppose [a, b] and [c, d] are intervals in the collection. 
Neither is a candidate as an interval which intersects all others since [a, b] ∩ 
[c, d] = ∅. Suppose some interval [e, f] (where e < f) intersects all others. Suppose 
e and f have received the labels A(j) and B(j). 

In case (1), where a and b are on the left, we know that the endpoint of [e, f] 
labeled A(j) is between b and c. Thus [e, f] cannot intersect both [a, b] and 
[c, d]. 

Now consider case (2), where just a is on the left. Since [e, f] meets [c, d] we 
have f > c, hence f is an outer point (f = B(j)). Further, e is an inner point and 
therefore [e, f] went to the right. However, the last labeled pair, {A(n - 2), 
B(n - 2)}, must have gone to the left since (in this case) we assigned more labels 
on the left than on the right. Thus, for some k, with j < k ≤ n - 2, we have that 
[A(k), B(k)] went to the left, but [A(k - 1), B(k - 1)] went right. Thus A(k) < n 
and A(k) < A(j) since A(k) is a later-assigned, left-side inner point. It now 
follows that [B(k), A(k)] is disjoint from [A(j), B(j)] = [e, f], a contradiction. 

With slightly more care one may use this construction to show that for any 
k < n, the probability that in a family of n random intervals there are at least k 
which intersect all others is 

$$\frac{2^k}{\binom{2k+1}{k}}$$

independent, again, of n. 
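This too can be checked exhaustively for small n in the discrete model. The sketch below takes the probability to be $2^k/\binom{2k+1}{k}$, which is our reading of the displayed formula and agrees with the value 2/3 at k = 1; the function names are ours.

```python
from fractions import Fraction
from math import comb

def pairings(points):
    """Yield every way of pairing up a sorted list of points."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, mate in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, mate)] + tail

def universal_count(family):
    """Number of intervals that meet every interval of the family."""
    return sum(all(max(a, c) <= min(b, d) for c, d in family) for a, b in family)

def at_least_k_universal(n, k):
    """Exact probability that at least k of n random intervals meet all others."""
    hits = total = 0
    for family in pairings(list(range(1, 2 * n + 1))):
        total += 1
        hits += universal_count(family) >= k
    return Fraction(hits, total)

def formula(k):
    """The conjectured closed form 2^k / C(2k + 1, k)."""
    return Fraction(2 ** k, comb(2 * k + 1, k))
```

Comparing `at_least_k_universal(n, k)` with `formula(k)` for several n and each k < n shows the independence of n directly.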
Playing the Numbers 

You can play mathematical sequences as musical notes on your computer. If the 
sequence is periodic, the melody might be quite interesting, and it certainly puts a 
new dimension into the subject. 

For example, fix a modulus m. The sequence of Fibonacci numbers 
0, 1, 1, 2, 3, 5, ... (modulo m) 

is periodic. You can listen to the Fibonacci numbers (mod m) by running the 
following little Basic program (enter your favorite modulus m at the prompt). 

10 CLS: INPUT "MODULUS"; M: A = 0: B = 1 
20 C = A + B: C = C - M*INT(C/M) 
30 SOUND 130*2^(C/12), 2: A = B: B = C: GOTO 20 

The tune ‘m = 51,’ for instance, is rather appealing. Try other periodic mathemati- 
cal sequences, scoring several of them together, varying the frequencies and 
durations, etc. See problem E3410 on p. 916 of this issue for information about the 
periods. 


— U. Phonious 



